Abstract

Coordination games describe social or economic interactions in which the adoption of a com- 
mon strategy has a higher payoff. They are classically used to model the spread of conventions, 
behaviors, and technologies in societies. Here we consider a two-strategy coordination game
played asynchronously between the nodes of a network. Agents behave according to a noisy 
best-response dynamics. 

It is known that noise removes the degeneracy among equilibria: In the long run, the "risk- 
dominant" behavior spreads throughout the network. Here we consider the problem of computing 
the typical time scale for the spread of this behavior. In particular, we study its dependence on 
the network structure and derive a dichotomy between highly-connected, non-local graphs that 
show slow convergence, and poorly connected, low dimensional graphs that show fast convergence. 
1 Introduction



The unprecedented growth of online social networks and their increasing role in the spread of
knowledge, behaviors and new technologies have given rise to a wealth of interesting questions. Is it
possible to explain the emergence of a new phenomenon based on the dynamics of the interaction
among individuals [Klein07, Young93]?
As an example consider a two-dimensional grid and assume that each node adopts the new behavior
(call it +1, the alternative being -1) if at least two of its neighbors have already adopted it. It is
then easy to see that no finite set of +1's can influence the whole grid, and in fact the influence of
any finite set of +1's is limited to the smallest rectangle that circumscribes them. For instance, the
group of black nodes in the figure on the right does not expand further.
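
The rectangle claim can be checked numerically. The following is a minimal sketch, assuming that adoption is irreversible in the noise-free rule and that the grid is a finite `size` by `size` box; the function names and the seed set are hypothetical and chosen only for illustration.

```python
def threshold_closure(seed, size, threshold=2):
    """Apply the noise-free adoption rule until no further node switches to +1."""
    active = set(seed)
    changed = True
    while changed:
        changed = False
        for r in range(size):
            for c in range(size):
                if (r, c) in active:
                    continue
                nbrs = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
                # A node adopts +1 once at least `threshold` neighbors have adopted it.
                if sum(nb in active for nb in nbrs) >= threshold:
                    active.add((r, c))
                    changed = True
    return active


def bounding_rectangle(cells):
    """Smallest axis-parallel rectangle containing all the given cells."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return {(r, c)
            for r in range(min(rows), max(rows) + 1)
            for c in range(min(cols), max(cols) + 1)}


seed = {(3, 3), (4, 4), (5, 3)}          # hypothetical initial adopters
final = threshold_closure(seed, size=10)
print(final <= bounding_rectangle(seed))  # the spread never leaves the rectangle
```

In this example the three seeds fill their 3-by-2 bounding rectangle and then stop, which is the behavior described for the black nodes.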

Now, consider the same dynamics with a small noise, i.e. assume that, with some small probability
$\epsilon$, agents do not follow the pre-established rule. This can have dramatic effects. If the gray
node in the figure switches to +1 by mistake, then a new layer may be added to the group of black
nodes at no extra (probability) cost. Of course, the reverse can happen: the block of +1's can be
eroded because of noise. However if the initial block is large enough (and under some technical
assumptions) the former mechanism will prevail [NeS91, NeS92]. The important point is that 'large
enough' means here larger than some constant quantity, and that influence spreads at some positive
velocity. This phenomenon was first discovered in statistical physics, under the name of
'nucleation', and has received intense attention in the mathematical physics literature over the last
30 years [OV04, Bw03].

Similar models were developed independently within the context of evolutionary game theory. 
For example consider a simple game in which every individual placed in a network has to make 
a decision between two alternatives. The payoff of an action for each person is proportional to 
the number of its neighbors who are taking the same action. These games, known as coordination 
games, have been studied extensively for modeling the emergence of technologies and social norms
[Young93, Morr00, Klein07, Blu93]. The main conclusion of this line of work is that adding a small
random perturbation to best response dynamics creates an evolutionary force that drives the system
towards a particular equilibrium in which all players take the same action.

In real-world networks stochasticity is unavoidable. As a consequence, we can expect the players 
to eventually achieve coordination on a particular equilibrium, irrespective of the initial state. The 
present paper characterizes the rate of convergence for such dynamics in terms of explicit graph 
quantities. It thus provides the first step in a longer-term program aimed at developing approximation
algorithms to estimate convergence to Nash equilibria. 

Our characterization is expressed in terms of the tilted cutwidth and the tilted cut of the graph,
which are dual quantities. The former provides a path to the equilibrium that gives an upper bound on
the convergence time. The latter corresponds to a bottleneck along the highest separating set in the space
of configurations. We show that tilted cut and tilted cutwidth coincide for the 'slowest' subgraph 
and the convergence time is exponential in this graph parameter. 

The proof uses an argument similar to [DV76, DSC93, TS89] to relate hitting time to the spectrum
of an appropriate transition kernel. The convergence time is then estimated in terms of the most 
likely path from the worst-case initial configuration. It turns out that the most likely path is the 
one that implies the lowest decrease of probability in stationary measure. A delicate argument using 
the submodularity of the potential function shows that there exists a monotone increasing path with 
this property. In order to prove the characterization in terms of tilted cut we study the 'slowest'
eigenvector and show that it is monotone using a fixed point argument. We then approximate the
eigenvector with a characteristic function.

The above result allows us to estimate the convergence time for specific graphs through their 
isoperimetric function. For example in interaction graphs that can be embedded in low dimensional 
spaces, the dynamics converges in a very short time. On the other hand, for a wide class of bounded 
degree graphs such as random regular graphs or certain small-world networks the convergence may 
take as long as exponential in the number of nodes. 

Related work 

There is a very interesting line of work in mathematical physics leading to very sharp estimates of 
the convergence times of specific models: mainly two- and three-dimensional grids [BC96, BM02].
Berger et al. [BK+05] compute the mixing time of a similar dynamics in terms of the cutwidth of the
graph using different techniques from the current paper. 

In the game theory literature, one of the criticisms of Nash equilibrium is that its multiplicity makes
it hard to predict the outcome of a play. How do players learn to play a specific equilibrium, and
which one do they select? For example, the grid graph described above shows that the coordination
game can have several equilibria. There is a vast literature in evolutionary game theory on resolving
this problem, especially in the context of coordination games [KMR93, Young93, Ell93, Blu93, FL98].



The importance of estimating convergence times was first stressed in the pioneering work of 
Ellison [Ell93]. He argued that the long-run equilibrium is relevant only if the convergence time
is reasonably small. Ellison studied the rate of convergence for two extreme interaction graphs: a 
complete graph and a graph obtained by placing individuals on a cycle and connecting all pairs of 
distance smaller than some given range. He showed that the dynamics converges very slowly for the 
former model and very quickly for the latter. Based on this observation, he concluded that when 
the interaction is global the outcome is determined by historic factors. In contrast, when players 
"interact with small sets of neighbors," we can assume that evolutionary forces may determine the 
outcome. 

Our result implies that the key property of the network that captures the rate of convergence is 
not the number of nodes each agent interacts with, or the number of edges of the graph. This can be 
proved for a large class of (non-reversible) noisy best-response dynamics including the one of [Ell93].



2 Definitions 

A game is played in periods $t = 1, 2, 3, \dots$ among a set $V$ of players. Each player $i$ has two
alternative strategies $x_i \in \{+1, -1\}$. Let $x = \{x_i : i \in V\}$. The payoff matrix $A$ is a
$2 \times 2$ matrix illustrated in the figure. The players interact on an undirected graph
$G = (V, E)$. The payoff of player $i$ is $\sum_{j \in \partial i} A(x_i, x_j)$, where $\partial i$ is
the set of neighbors of vertex $i$.

The payoff matrix $A$ defines a coordination game, which means $a > d$ and $b > c$. It is easy to
verify that for every $i$, the best response strategy is $\mathrm{sign}(h_i + \sum_{j \in \partial i} x_j)$,
where $h_i = \frac{a-d-b+c}{a-d+b-c}\,|\partial i| \equiv h\,|\partial i|$, with $|\partial i|$ the
degree of node $i$. We assume that $a - b > d - c$, so that $h_i > 0$ for all $i \in V$ of
non-vanishing degree. Harsanyi and Selten [HS88] named +1 the "risk-dominant" equilibrium, as it
minimizes the utility loss due to a change in the opponent's strategy. Notice that this does not
coincide, in general, with the payoff-dominant equilibrium.
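
The equivalence between the payoff comparison and the sign rule can be verified exhaustively on small neighborhoods. The sketch below assumes that the first row and column of $A$ correspond to strategy +1, that ties are resolved in favor of -1 in both formulations, and uses hypothetical payoff values $a = 5$, $b = 2$, $c = 0$, $d = 1$ (which satisfy $a > d$, $b > c$ and $a - b > d - c$).

```python
from fractions import Fraction
from itertools import product


def h_field(a, b, c, d, degree):
    """h_i = (a - d - b + c) / (a - d + b - c) * degree, computed exactly."""
    return Fraction(a - d - b + c, a - d + b - c) * degree


def best_response_by_payoff(a, b, c, d, neighbor_strategies):
    """Compare the total payoff of playing +1 against that of playing -1."""
    n_plus = sum(1 for x in neighbor_strategies if x == +1)
    n_minus = len(neighbor_strategies) - n_plus
    payoff_plus = a * n_plus + c * n_minus    # player chooses +1
    payoff_minus = d * n_plus + b * n_minus   # player chooses -1
    return +1 if payoff_plus > payoff_minus else -1


def best_response_by_sign(a, b, c, d, neighbor_strategies):
    """Sign rule: +1 exactly when h_i plus the sum of neighbor strategies is positive."""
    h = h_field(a, b, c, d, len(neighbor_strategies))
    return +1 if h + sum(neighbor_strategies) > 0 else -1


# Hypothetical payoffs; every neighborhood of degree 1..5 is enumerated.
agree = all(
    best_response_by_payoff(5, 2, 0, 1, nbrs) == best_response_by_sign(5, 2, 0, 1, nbrs)
    for degree in range(1, 6)
    for nbrs in product((-1, +1), repeat=degree)
)
print(agree)
```

Exact rational arithmetic is used so that the tie case (for instance degree 3 with one +1 neighbor under these payoffs) is treated identically by both rules.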

Noisy best-response dynamics is specified by a one-parameter family of Markov chains
$\{\mathbb{P}_\beta\{\cdots\}\}$ indexed by $\beta$. The parameter $\beta \in \mathbb{R}_+$ determines
how noisy the dynamics is, with $\beta = +\infty$ corresponding to the noise-free case. Two types of
updates are naturally defined:



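[Figure: payoff matrix A. Each entry lists the row player's payoff followed by the column player's
payoff; the first row and column correspond to strategy +1, the second to strategy -1.]

            +1       -1
    +1     a, a     c, d
    -1     d, c     b, b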



(1) Synchronous updates. At each step of the chain, each player draws a new strategy $y_i$ conditionally
on its neighbors' strategies $x_{\partial i}$ at the previous time step. The conditional distribution is
denoted by $p_{i,\beta}(y_i | x_{\partial i})$.

(2) Asynchronous updates. Each node $i$ updates its value at the arrival times of an independent Poisson
clock of rate 1. The conditional distribution of the new strategy is again denoted by
$p_{i,\beta}(y_i | x_{\partial i})$.

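The dynamics of [Ell93] is recovered by the following transition probabilities. Let
$y_i^* = \mathrm{sign}(h_i + \sum_{j \in \partial i} x_j)$. Then for every player $i$,
$p_{i,\beta}(y_i^* | x_{\partial i}) = 1 - e^{-2\beta}$ and $p_{i,\beta}(-y_i^* | x_{\partial i}) = e^{-2\beta}$.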

A considerable simplification is achieved for the so-called heat bath or Glauber kernel

    $p_{i,\beta}(y_i | x_{\partial i}) = \frac{1}{1 + e^{-2\beta y_i K_i(x)}}$ ,    (1)

where $K_i(x) = h_i + \sum_{j \in \partial i} x_j$. This is also known as the logit update rule, which is
standard in the discrete choice literature [M74]. It has also been used to model subjects' empirical
choice behavior in laboratory situations [MS94, MP95]. In this context it has been studied by Blume
[Blu93]. The corresponding Markov chain is reversible with respect to the stationary distribution
$\mu_\beta(x) \propto \exp(-\beta H(x))$ with

    $H(x) = - \sum_{(i,j) \in E} x_i x_j - \sum_{i \in V} h_i x_i$ ,    (2)

in the case of asynchronous dynamics. This is the energy function of the Ising model; an analogous
expression can be written for synchronous updates. In both of the above models the stationary
distribution for large $\beta$ concentrates around the all-(+1) configuration. In other words, these
dynamics predict that in the long run, the play will converge to the risk-dominant equilibrium.
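
Reversibility and the concentration on the all-(+1) configuration can be checked by brute force on a small example. The sketch below assumes the heat bath form $1/(1 + e^{-2\beta y_i K_i(x)})$ for the update probability and the Ising energy of Eq. (2); the graph `EDGES` and the values in `H_FIELD` are hypothetical.

```python
import itertools
import math

EDGES = [(0, 1), (1, 2), (2, 0), (2, 3)]   # hypothetical 4-node graph
H_FIELD = [0.3, 0.3, 0.3, 0.3]             # hypothetical positive biases h_i
N = len(H_FIELD)
NEIGHBORS = {i: [v if u == i else u for u, v in EDGES if i in (u, v)]
             for i in range(N)}


def energy(x):
    """Ising energy H(x) = -sum_{(i,j) in E} x_i x_j - sum_i h_i x_i."""
    pair = sum(x[i] * x[j] for i, j in EDGES)
    field = sum(h * xi for h, xi in zip(H_FIELD, x))
    return -pair - field


def local_field(x, i):
    """K_i(x) = h_i + sum of the neighbors' strategies (independent of x_i)."""
    return H_FIELD[i] + sum(x[j] for j in NEIGHBORS[i])


def glauber_prob(x, i, y, beta):
    """Probability that node i, when updated in configuration x, picks strategy y."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * y * local_field(x, i)))


def stationary(beta):
    """Gibbs measure proportional to exp(-beta * H(x)), by exhaustive enumeration."""
    states = list(itertools.product((-1, +1), repeat=N))
    weights = [math.exp(-beta * energy(x)) for x in states]
    total = sum(weights)
    return {x: w / total for x, w in zip(states, weights)}


def flip(x, i):
    """Configuration obtained from x by switching the strategy of node i."""
    return x[:i] + (-x[i],) + x[i + 1:]


mu = stationary(1.0)
all_plus = (1,) * N
print(max(mu, key=mu.get) == all_plus)   # the mode is the risk-dominant state
```

Detailed balance holds pairwise for single-node flips, since the node-selection probability is the same in both directions and cancels.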

In the following we will often adopt the equivalent representation of configurations as subsets of
vertices $S \subseteq V$, whereby $i \in S$ if and only if $x_i = +1$, and, with a slight abuse of
notation, we shall denote by $H(S)$ the corresponding energy. If $|S|_h \equiv \sum_{i \in S} h_i$ then
$H(S) - H(\emptyset) = 2\,\mathrm{cut}(S, V \setminus S) - 2|S|_h$. It is important to notice that
$H(\cdot)$ is submodular.
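
Both facts, the cut identity and submodularity, are easy to confirm by enumeration. The following sketch does so on a hypothetical 4-vertex graph; the edge list and the values of `H_FIELD` are illustrative assumptions, chosen to be exactly representable in floating point.

```python
from itertools import combinations

EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # hypothetical graph
H_FIELD = [0.5, 0.25, 0.75, 0.5]                   # hypothetical biases h_i
VERTICES = range(len(H_FIELD))


def cut(S):
    """Number of edges with exactly one endpoint in S."""
    return sum((u in S) != (v in S) for u, v in EDGES)


def tilt(S):
    """|S|_h, the sum of h_i over i in S."""
    return sum(H_FIELD[i] for i in S)


def energy(S):
    """H(S): Ising energy of the configuration with x_i = +1 exactly on S."""
    x = [1 if i in S else -1 for i in VERTICES]
    pair = sum(x[u] * x[v] for u, v in EDGES)
    field = sum(h * xi for h, xi in zip(H_FIELD, x))
    return -pair - field


def all_subsets():
    items = list(VERTICES)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


SUBSETS = all_subsets()
EMPTY = frozenset()

# Identity: H(S) - H(empty) = 2 cut(S, V minus S) - 2 |S|_h.
identity_ok = all(abs(energy(S) - energy(EMPTY) - (2 * cut(S) - 2 * tilt(S))) < 1e-12
                  for S in SUBSETS)

# Submodularity: H(A | B) + H(A & B) <= H(A) + H(B) for all A, B.
submodular_ok = all(energy(A | B) + energy(A & B) <= energy(A) + energy(B) + 1e-12
                    for A in SUBSETS for B in SUBSETS)
print(identity_ok, submodular_ok)
```

Submodularity here is inherited from the cut function, since the term $-2|S|_h$ is modular.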

Our aim is to determine whether this prediction is realized in a reasonable time. To this end, we
let $T_+$ denote the hitting time to the all-(+1) configuration, and define the typical hitting time for
+1 as

    $\tau_+(G;h) = \sup_{x} \inf \{ t \ge 0 : \mathbb{P}_x\{T_+ > t\} \le e^{-1} \}$ .    (3)

For the sake of brevity, we will often refer to this as the hitting time, and drop its arguments.

3 Main results 

Our first step is to express the large-$\beta$ (low-noise) behavior of $\tau_+(G;h)$ in terms of
graph-theoretical quantities. Let $n = |V|$ be the number of players. Given $h = \{h_i : i \in V\}$ and
$U \subseteq V$, we let $|U|_h \equiv \sum_{i \in U} h_i$. We define the tilted cutwidth of $G$ as

    $\Gamma(G;h) \equiv \min \max_{t} [\mathrm{cut}(S_t, V \setminus S_t) - |S_t|_h]$ .    (4)

Here the min is taken over all linear orderings of the vertices $i(1), \dots, i(n)$, with
$S_t \equiv \{i(1), \dots, i(t)\}$. Note that if $h_i = 0$ for all $i$, the above is equal to the
cutwidth of the graph.
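
For very small graphs the definition can be evaluated directly by enumerating all orderings. The sketch below is a brute-force illustration only (it is exponential in $n$); it assumes that the inner maximum runs over the prefixes $S_1, \dots, S_n$, and the function name and test graphs are hypothetical.

```python
from itertools import permutations


def tilted_cutwidth(n, edges, h):
    """Brute-force tilted cutwidth: min over orderings of the max over prefixes
    S_1..S_n of cut(S_t, V minus S_t) - |S_t|_h. Assumes prefixes start at t = 1."""
    best = float("inf")
    for order in permutations(range(n)):
        prefix = set()
        worst = float("-inf")
        for v in order:
            prefix.add(v)
            cut = sum((a in prefix) != (b in prefix) for a, b in edges)
            worst = max(worst, cut - sum(h[i] for i in prefix))
        best = min(best, worst)
    return best


path4 = [(0, 1), (1, 2), (2, 3)]
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
cycle5 = [(i, (i + 1) % 5) for i in range(5)]

print(tilted_cutwidth(4, path4, [0] * 4))   # ordinary cutwidth of a path
print(tilted_cutwidth(4, k4, [0] * 4))      # ordinary cutwidth of K_4
print(tilted_cutwidth(4, k4, [1] * 4))      # the tilt h_i = 1 lowers the value
```

With $h = 0$ the function reduces to the usual cutwidth, and increasing $h$ can only decrease the result.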

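Given a collection of subsets of $V$, $\Omega \subseteq 2^V$, such that $\emptyset \in \Omega$ and
$V \notin \Omega$, we let $\partial\Omega$ be the collection of couples $(S, S \cup \{i\})$ such that
$S \in \Omega$ and $S \cup \{i\} \notin \Omega$. We then define the tilted cut of $G$ as

    $\Delta(G;h) \equiv \max_{\Omega} \min_{(S_1,S_2) \in \partial\Omega} \max_{i=1,2} [\mathrm{cut}(S_i, V \setminus S_i) - |S_i|_h]$ ,    (5)

the maximum being taken over monotone sets $\Omega$ (i.e. such that $S \in \Omega$ implies
$S' \in \Omega$ for all $S' \subseteq S$).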
It thus coincide 

It is known that, in the case $h_i = 0$, the mixing time of Glauber dynamics is at most exponential
in the cutwidth of $G$ [BK+05]. The following result provides a generalization to the case $h_i > 0$ of
interest here, in the limit of large $\beta$. Since $\Gamma(G;h)$ (as well as $\Delta(G;h)$) is
decreasing in $h$, the upper bound is smaller than the one for the $h_i = 0$ case.

Theorem 3.1. Given an induced subgraph $F \subseteq G$, let $h^F$ be defined by
$h^F_i = h_i + |\partial i|_{G \setminus F}$, where $|\partial i|_{G \setminus F}$ is the degree of $i$
in $G \setminus F$. For reversible asynchronous dynamics we have
$\tau_+(G;h) = \exp\{2\beta\Gamma_*(G;h) + o(\beta)\}$, where

    $\Gamma_*(G;h) = \max_{F \subseteq G} \Gamma(F;h^F) = \max_{F \subseteq G} \Delta(F;h^F)$ .    (6)

Note that tilted cutwidth and tilted cut are dual quantities. The former corresponds to the maximal
energy height along the lowest path to the +1 equilibrium. The latter is the lowest energy along the
highest separating set in the space of configurations. A natural strategy for estimating $\tau_+(G;h)$
consists in lower bounding $\Delta(F;h^F)$ by exhibiting a monotone set $\Omega \subseteq 2^{V(F)}$, and
upper bounding $\Gamma(F;h^F)$ by exhibiting a linear ordering of $V(F)$. The above theorem shows that
tilted cut and cutwidth coincide for the 'slowest' subgraph of $G$, provided the $h_i$'s are
non-negative. The hitting time is exponential in this graph parameter.

The two characterizations above are exact but it is highly non-trivial to compute them. In the 
rest of this section, we will show how the above theorem implies the known results for special classes 
of graphs. Then, we relate tilted cutwidth to graph expansion and derive a dichotomy between the 
hitting time on expanders versus locally connected graphs. In the end, we show how to use algorithms 
for sparsest cuts to find the approximately optimal linear ordering as defined in tilted cutwidth. 

The cases treated by Ellison are easily understood within the present framework. In order to
derive a lower bound for the complete graph, with $h_i = h$ for all $i \in V$, one can restrict
attention to $F = G$ and for that graph define $\Omega$ to be the family of all sets with cardinality
at most $n/2$:

    $\Gamma_*(K_n;h) \ge \min_{|S| = n/2} [\mathrm{cut}(S, V \setminus S) - |S|_h] = (n-h)^2/4 + O(n)$ .    (7)
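
The arithmetic behind Eq. (7) can be made explicit: for even $n$ every set of size $n/2$ in $K_n$ has cut $n^2/4$ and tilt $hn/2$, and $n^2/4 - hn/2 = (n-h)^2/4 - h^2/4$. The sketch below compares a brute-force minimum with this closed form; the function names are hypothetical and $n$ is assumed even.

```python
from itertools import combinations


def min_half_cut_value(n, h):
    """Brute-force min over |S| = n/2 of cut(S, V minus S) - |S|_h on K_n, uniform h."""
    best = None
    for subset in combinations(range(n), n // 2):
        members = set(subset)
        cut = sum(1 for u in range(n) for v in range(u + 1, n)
                  if (u in members) != (v in members))
        value = cut - h * len(members)
        best = value if best is None else min(best, value)
    return best


def closed_form(n, h):
    """n^2/4 - h*n/2 rewritten as (n - h)^2/4 - h^2/4 (n assumed even)."""
    return (n - h) ** 2 / 4 - h ** 2 / 4


print(min_half_cut_value(6, 1), closed_form(6, 1))
print(min_half_cut_value(8, 2), closed_form(8, 2))
```

The correction term $-h^2/4$ does not depend on $n$, so the leading behavior is quadratic in $n$, in line with the slow convergence on the complete graph.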

The second example studied by Ellison is a $2k$-regular graph resulting from connecting all vertices
of distance at most $k$ in a cycle. In that graph, the maximum is again achieved for $F = G$, and the
natural linear ordering of the cycle yields $\Gamma(G;h) \le 4k^2$.
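
The bound for the cycle-based graph can be checked by computing the largest cut met along the natural ordering. The sketch below assumes $n > 2k$ (so that no edge is repeated) and drops the non-negative tilt term, which can only lower the value; the helper names are hypothetical.

```python
def circulant_edges(n, k):
    """Edges of the 2k-regular graph joining cycle vertices at distance at most k.
    Assumes n > 2k so that each edge appears exactly once."""
    return {(i, (i + d) % n) for i in range(n) for d in range(1, k + 1)}


def max_prefix_cut(n, edges):
    """Largest cut(S_t, V minus S_t) over prefixes S_t = {0, ..., t-1} of the cycle order."""
    worst = 0
    for t in range(1, n + 1):
        prefix = set(range(t))
        worst = max(worst, sum((u in prefix) != (v in prefix) for u, v in edges))
    return worst


n, k = 20, 3
value = max_prefix_cut(n, circulant_edges(n, k))
print(value, value <= 4 * k ** 2)
```

Each of the two boundaries of a contiguous block is crossed by $k(k+1)/2$ edges, so the natural ordering never exceeds $k(k+1)$ cut edges, independently of $n$.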



It is also straightforward to recover the result of Young [Young95] from the above theorem.
Indeed, the hypotheses of [Young95] are equivalent to the existence of a sequence
$S_1, \dots, S_T \subseteq V$ such that $H(S_t) = \min_{S' \subseteq S_t} H(S') < 0$ and $|S_i| \le k$.
By flipping vertices along this sequence and using the submodularity of $H(\cdot)$, it follows that
$\Gamma(F;h^F) \le k^2$.



3.1 Relation to graph expansion 

The following Lemma links the isoperimetric function of G (and its subgraphs) to the hitting time. 
It is particularly useful when analyzing specific graph families. 

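Lemma 3.2. For $\theta \in \mathbb{R}$ define $J(\theta) = [\theta - h_{\max}, \theta + h_{\max}]$.
Assume that there exist constants $\alpha$ and $\gamma < 1$ such that for any subset of vertices
$U \subseteq V$, and any $\theta$ such that there exists $S \subseteq U$ with $|S|_h \in J(\theta)$, we have

    $\mathrm{cut}(S, U \setminus S) \le \alpha |S|^\gamma$ ,    (8)

for at least one such $S$. Then
$\Gamma_*(G;h) \le A(\alpha,\gamma,h_{\max})\, h_{\min}^{-\gamma/(1-\gamma)} \log\max(2, h_{\min}^{-1})$.

Conversely, assume there exists $U \subseteq V(G)$ such that $|\partial i \cap (V \setminus U)| \le b$ for
all $i \in U$, and the subgraph induced by $U$ is a $(\delta, \lambda)$ expander. Then
$\Gamma_*(G;h) \ge (\lambda - h_{\max} - b) \lfloor \delta|U| \rfloor$.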

In words, the hitting time is dominated by highly connected subgraphs of G, that are loosely tied 
to the rest of the graph. On the other hand, an upper bound on the isoperimetric function leads to 
upper bounds on the hitting time. 

In order to gain some intuition we consider a few interesting graph models: 

(a) Finite-range $d$-dimensional networks. The graph $G$ is a $d$-dimensional range-$K$ network if
we can associate to each of its vertices $i \in V$ a position $x_i \in \mathbb{R}^d$ such that: (1)
whenever $(i,j) \in E$, $d_{\mathrm{Eucl}}(x_i, x_j) \le K$ (here $d_{\mathrm{Eucl}}(\cdots)$ denotes
Euclidean distance); (2) any cube of volume $v$ contains at most $2v$ vertices. We will also say that
$G$ is embeddable in this case.

(b) Small world networks. Again, the vertices are those of a $d$-dimensional grid of side $n^{1/d}$. Two
vertices $i, j$ are connected by an edge if they are nearest neighbors. Further, each vertex $i$ is
connected to $k$ other vertices $j(1), \dots, j(k)$ drawn independently with distribution
$P_i(j) = C(n)\,|i-j|^{-r}$.

(c) Random regular graphs of degree k. 

Theorem 3.3. The following statements hold with high probability: 

If $G$ is a $d$-dimensional finite-range graph, and $h_{\min} > 0$, then $\Gamma_*(G;h) = O(1)$.

If $G$ is a small world network with $r \ge d$, and $h_{\max} < k - d - 5/2$, then
$\Gamma_*(G;h) = \Omega(\log n / \log\log n)$.

If $G$ is a small world network with $r < d$, and $h_{\max}$ is small enough, then
$\Gamma_*(G;h) = \Omega(n)$.

If $G$ is a random $k$-regular graph, and $h_{\max} < k - 2$, then $\Gamma_*(G;h) = \Omega(n)$.

These qualitatively distinct behaviors correspond to different mechanisms by which consensus 
spreads in these networks. In finite-range networks, the process is initiated in a relatively compact 
region taking value +1. If this is large enough (which happens with positive probability), it spreads 
through the whole graph. This is possible because of the bias provided by $h_{\min} > 0$. Indeed the
proof of this statement implies an upper bound of the form T{G;h) = 0{hj^^ log(l//imin))- 

In small-world networks with $r \ge d$ the process is similar, but the spread of +1's is blocked in
its very last stages by small, highly connected regions of size roughly $\log n$. Finally, small-world
networks with $r < d$ and random regular graphs are expanders and convergence is extremely slow.

All the above statements take the form of a tradeoff between how 'well-connected' $G$ is and how
biased the dynamics is (the latter being measured by $h_{\min}$). In the case of well-connected graphs
it is not hard to prove upper bounds on $\Gamma_*(G;h)$ for large enough $h$. For instance, in the case
of $k$-regular graphs $\Gamma_*(G;h) = O(1)$ if $h_{\min} > k$.



3.2 Approximating tilted cut and tilted cutwidth 

The maximization over $\Omega$ in Eq. (5) for computing the tilted cut is highly non-trivial. Here we
obtain a class of lower bounds by restricting $\Omega$ essentially to subsets with a given cardinality.
The following result shows that the 'loss' resulting from this restriction is bounded, under
appropriate conditions. On the other hand, it implies that algorithms for computing sparse cuts find
approximately optimal orderings corresponding to a tilted cutwidth.

Theorem 3.4. Assume that, for some $L_1, L_2$ with $L_2 \ge h_{\max}$, and for every induced subgraph
$F \subseteq G$, we have

    $\min_{S :\, |S|_h \in [L_1, L_2]} [\mathrm{cut}(S, V(F) \setminus S) - |S|_h] \le L_1$ ,    (9)

where it is understood that $\emptyset \ne S \subseteq V(F)$. If, for every subset of vertices $U$ with
$|U|_h \le L_2$, the induced subgraph has cutwidth upper bounded by $C$, then
$\Gamma_*(G;h) \le C + L_1 + L_2$.

It is interesting to compare this result with the analysis of contagion models [Morr00]. In that
case contagion takes place if there exists an ordering of the vertices $i(1), i(2), \dots$ such that,
assuming $x_{i(1)} = +1, x_{i(2)} = +1, \dots, x_{i(t)} = +1$, the best response for $i(t+1)$ is
strategy +1. Theorem 3.4 allows one to replace single vertices by 'blocks', as long as they have
bounded size and bounded cutwidth.

Assuming that a 'good' path to consensus exists, can it be found efficiently? By using a simple
generalization of Feige and Krauthgamer's [FK02] $O(\log^2 n)$ approximation algorithm for finding the
sparsest cut of a given cardinality, we have the following

Remark 3.5. If $G = (V, E)$ satisfies Eq. (9), it is possible to find an ordering
$i_1, i_2, \dots, i_n$ of $V$ in polynomial time so that for every $S_t = \{i_1, i_2, \dots, i_t\}$, and
$L = L_1 + L_2 + C$,

    $\mathrm{cut}(S_t, V \setminus S_t) = O(|S_t|_h \log^2 n + L \log n)$ .

3.3 Nonreversible and synchronous dynamics

In this section we consider a general class of Markov dynamics over $x \in \{+1, -1\}^V$. An element
in this class is specified by $p_{i,\beta}(y_i | x_{\partial i})$, with
$p_{i,\beta}(+1 | x_{\partial i})$ a non-decreasing function of the number $\sum_{j \in \partial i} x_j$.
Further we assume that $p_{i,\beta}(-1 | x_{\partial i}) \le e^{-2\beta}$ when
$h_i + \sum_{j \in \partial i} x_j > 0$. Note that the synchronous Markov chain studied in KMR [KMR93]
and Ellison [Ell93] is a special case in this class.

Denote the hitting time of the all-(+1) configuration in graph $G$ by $\tau_+(G)$ as before.

Proposition 3.6. Let $G = (V, E)$ be a $k$-regular graph of size $n$ such that, for some
$\lambda, \delta > 0$, every $S \subseteq V$ with $|S| \le \delta n$ has vertex expansion at least
$\lambda$. Then for any noisy best-response dynamics defined above, there exists a constant
$c = c(\lambda, \delta, k)$ such that $\tau_+(G;h) \ge \exp\{\beta c n\}$ as long as

3k max,- hi 

Note that random regular graphs satisfy the condition of the above proposition as long as the $h_i$'s
are small enough. The proof of the proposition is obtained by simply considering the evolution of a
one-dimensional chain tracking the number of +1 vertices.

Proposition 3.7. Let $G$ be a $d$-dimensional grid of size $n$, with constant $d \ge 1$. For any
synchronous or asynchronous noisy best-response dynamics defined above, there exists a constant $c$
such that $\tau_+(G;h) \le \exp\{\beta c\}$.

The above proposition can be proved by a simple coupling argument very similar to that of
Young [Young93]. We will leave its details to a more complete version of the paper. The above two
propositions show that for a large class of noisy best-response dynamics, including the one considered
in [Ell93], the degrees of vertices are not the key property dictating the rate of convergence.

4 Proofs 

4.1 Theorem 3.1

It is a basic result in the theory of reversible Markov chains with exponentially small transition
rates that hitting times are related to 'energy barriers.'



Lemma 4.1. Consider a Markov chain with state space $\mathcal{S}$, reversible with respect to the
stationary measure $\mu_\beta(x) = \exp(-\beta H(x) + o(\beta))$, and assume that, if
$p_\beta(x,y) > 0$, then $p_\beta(x,y) = \exp(-\beta V(x,y) + o(\beta))$. Let
$A = \{x : H(x) \le H_0\}$ be non-empty, and define the typical hitting time for $A$ as in Eq. (3)
with $+$ replaced by $A$. Then $\tau_A = \exp\{\beta\Gamma_A + o(\beta)\}$ where

    $\Gamma_A = \max_{z} \min_{\omega : z \to A} \max_{t} [H(\omega_t) + V(\omega_t, \omega_{t+1}) - H(z)]$ ,    (10)

and the min runs over paths $\omega = (\omega_1, \omega_2, \dots, \omega_T)$ in configuration space such
that $p_\beta(\omega_t, \omega_{t+1}) > 0$ for each $t$.
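
For Glauber dynamics, where $H(x) + V(x,y) = \max(H(x), H(y))$ for a single-node flip, the inner min-max in Eq. (10) is a bottleneck shortest-path problem on the hypercube of configurations. The sketch below solves it with a Dijkstra-style search for a fixed starting configuration and target all-(+1); it is an illustration on tiny graphs only, and the function names and example graphs are hypothetical.

```python
import heapq


def energy(x, edges, h):
    """Ising energy H(x) = -sum_{(i,j)} x_i x_j - sum_i h_i x_i."""
    return -sum(x[i] * x[j] for i, j in edges) - sum(hi * xi for hi, xi in zip(h, x))


def barrier_to_all_plus(start, edges, h):
    """Min over single-flip paths from `start` to all-(+1) of max_t [H(w_t) - H(start)].
    The maximum includes the starting state, so the result is never negative."""
    n = len(start)
    target = (1,) * n
    base = energy(start, edges, h)
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        height, x = heapq.heappop(heap)
        if x == target:
            return height
        if height > best[x]:
            continue  # stale heap entry
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]
            new_height = max(height, energy(y, edges, h) - base)
            if new_height < best.get(y, float("inf")):
                best[y] = new_height
                heapq.heappush(heap, (new_height, y))
    return None  # unreachable: the hypercube is connected


path3 = [(0, 1), (1, 2)]
triangle = [(0, 1), (1, 2), (0, 2)]
start = (-1, -1, -1)
print(barrier_to_all_plus(start, path3, [0, 0, 0]))
print(barrier_to_all_plus(start, triangle, [0, 0, 0]))
```

On the 3-vertex path with $h = 0$ the barrier equals twice the cutwidth, and on the triangle it is larger, in line with the role played by well-connected subgraphs.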

The proof can be obtained by building on known results, for instance Theorem 6.38 in [OV04].
These however typically apply to exit times from local minima of $H(x)$. We provide a simple proof
based on spectral arguments in Appendix B.

For the sake of clarity, we split the proof of Theorem 3.1 in two parts: first the characterization
in terms of tilted cutwidth (i.e. the first identity in Eq. (6)); then the one in terms of tilted cut
(second identity in Eq. (6)).

Proof. (Theorem 3.1, Tilted cutwidth). Notice that Glauber dynamics satisfies the hypotheses of
Lemma 4.1 with $H(x)$ given by Eq. (2). In this case, for any allowed transition $x \to y$,
$H(x) + V(x,y) = \max(H(x), H(y))$. As a consequence, we can drop the factor $V(\cdots)$ in Eq. (10).
We thus obtain $\tau_+ = \exp(\beta \max_z \Gamma_+(z) + o(\beta))$ where

    $\Gamma_+(z) = \min_{\omega : z \to +1} \max_{t} [H(\omega_t) - H(z)]$ .    (11)

An upper bound is obtained by restricting the minimum to monotone paths. It is not hard to realize
that the result coincides with $2\Gamma(F;h^F)$, where $F$ is the subgraph induced by vertices $i$ such
that $z_i = -1$. It is far less obvious that the optimal path can indeed be taken to be monotone.

It is convenient to use the representation of the path $\omega = (x_0 = z, x_1, \dots, x_T = +1)$ as a
sequence of subsets of vertices: $\omega = (S_0 = S, S_1, \dots, S_T = V)$. We will consider a more
general class of paths whereby $S_t \setminus S_{t-1} = \{v\}$ or $S_t \subset S_{t-1}$, and let
$G(\omega) = \max_t [H(S_t) - H(S_0)]$.

Let us start by considering the optimal initial configuration. We claim that if
$B \in \arg\max_S \min_{\omega : S \to V} G(\omega)$ is such an optimal configuration, then for every
$A \subseteq B$, $H(A) \ge H(B)$. Indeed, suppose $H(A) < H(B)$. By prepending $B$ to any path
$\omega : A \to V$, we obtain a path $\omega' : B \to V$ with $G(\omega') < G(\omega)$. Therefore
$\min_{\omega' : B \to V} G(\omega') < \min_{\omega : A \to V} G(\omega)$, which is a contradiction.

Among all paths that achieve the optimum, choose the path $\omega$ that minimizes the potential
function $f(\omega) = |\omega|^2 |V| - \sum_{S_i \in \omega} |S_i|$. Intuitively, $f$ puts a very high
weight on shorter paths and then on paths with larger sets. We will prove that, with this choice,
$\omega$ is monotone.

For the sake of contradiction, suppose $\omega$ is not monotone. Let $S_k$ be the set with the smallest
index such that $S_{k+1} \subset S_k$. Partition $S_k \setminus S_{k+1}$ into two subsets
$R = (S_k \setminus S_{k+1}) \cap S_0$ and $T = (S_k \setminus S_{k+1}) \setminus S_0$. Without loss of
generality assume that for $1 \le i \le k$, $S_i = \{1, 2, \dots, i\} \cup S_0$. Let
$v_1 < v_2 < \dots < v_t$ be the elements of $T$ in the order of their appearance in $\omega$.

For a subset $A \subseteq T$, and $i \le k$, define the marginal value of subset $A$ at position $i$ to
be $M(A,i) = H(S_i \setminus A) - H(S_i)$. Since $H$ is submodular, $M(A,i)$ is non-decreasing with $i$
as long as $A \subseteq S_i$. Because of our claim about the initial condition, we have, in particular,

    $M(R,0) = H(S_0 \setminus R) - H(S_0) \ge 0$ .    (12)



The crucial lemma below is proved in Appendix O 



Lemma 4.2. One of the following two statements is correct: Case (I) There exists a subset
$T' \subseteq T$ such that for all $i$, $M(T', i) \le 0$; Case (II) $M(T \cup R, k) > 0$.

We are now ready to finish the proof. Suppose the first statement of the lemma is correct.
We construct a new path $\omega'$ by removing the vertices of $T'$ from the sequence $1, 2, \dots, t$ in
the beginning of $\omega$ and also removing $T'$ from $T$. Since $\omega'$ is shorter than $\omega$, we
only need to argue that $G(\omega') \le G(\omega)$. This is obvious because for every $i \le k$,
$H(S_i \setminus T') - H(S_i) = M(T', i) \le 0$.

In the second case, we construct another path by changing $S_{k+1}$. First note that since $\omega$ is
minimizing the potential function, $S_{k+2} = S_{k+1} \cup \{v\}$ for some $v$ that is not in $S_k$. Now
note that by replacing $S_{k+1}$ with $S_k \cup \{v\}$ we obtain a path with a lower value of the
potential function and at most the same barrier. This is because

    $H(S_{k+1} \cup \{v\}) - H(S_k \cup \{v\}) \ge H(S_{k+1}) - H(S_k) = M(T \cup R, k) > 0$ .    (13)

□ 



The second part of the proof exploits the well known fact that Glauber dynamics is monotone
for the Ising model. Given initial conditions $x(0)$ and $x'(0) \succeq x(0)$, the corresponding
evolutions can be coupled in such a way that $x'(t) \succeq x(t)$ after any number of steps.
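
The monotone coupling can be realized by driving both chains with the same node choice and the same uniform random number at every step. The sketch below illustrates this in discrete time (one node updated per step, a simplifying assumption in place of Poisson clocks), using the heat bath probability $1/(1 + e^{-2\beta K_i(x)})$ for strategy +1; the function name, graph and parameters are hypothetical.

```python
import math
import random


def coupled_glauber_preserves_order(edges, h, beta, steps, seed=0):
    """Run two Glauber chains, started from all-(-1) and all-(+1), with shared
    randomness. Return True if the componentwise order low <= high never breaks."""
    rng = random.Random(seed)
    n = len(h)
    nbrs = {i: [] for i in range(n)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    low = [-1] * n
    high = [+1] * n
    for _ in range(steps):
        i = rng.randrange(n)   # same node in both chains
        u = rng.random()       # same uniform variable in both chains
        for state in (low, high):
            k = h[i] + sum(state[j] for j in nbrs[i])
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * k))
            state[i] = +1 if u < p_plus else -1
        if any(a > b for a, b in zip(low, high)):
            return False
    return True


cycle6 = [(i, (i + 1) % 6) for i in range(6)]
print(coupled_glauber_preserves_order(cycle6, [0.2] * 6, beta=0.7, steps=2000))
```

The order is preserved because the probability of choosing +1 is non-decreasing in the neighbors' strategies, so whenever the lower chain picks +1 the upper chain does too.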

Proof. (Theorem 3.1, Tilted cut). By monotonicity of Glauber dynamics $\tau_+(G;h) \ge \tau_+(F;h^F)$
for any induced subgraph $F \subseteq G$. Lemma 4.1 implies $\Gamma_*(F;h^F) \ge \Delta(F;h^F)$: indeed
given a path $\omega = (S_0, S_1, \dots, S_T = V)$ this must have at least one step in $\partial\Omega$.
Hence $\Gamma_*(G;h) \ge \max_F \Delta(F;h^F)$.

We need to prove $\Gamma_*(G;h) \le \Delta(F;h^F)$ for at least one induced subgraph $F$. Fix $F$ to be
a subgraph which achieves the maximum in Eq. (6) (i.e. $\arg\max \Gamma(F;h^F)$). Notice that, to
leading exponential order, the hitting time in $F$ is the same as in $G$, i.e.
$\Gamma_*(F;h^F) = \Gamma_*(G;h)$.

Let $p_\beta(x,y)$ be the transition probabilities of Glauber dynamics on $F$, and $p^+_\beta(x,y)$ the
kernel restricted to $\{+1, -1\}^{V(F)} \setminus \{+1\}$. By this we mean that we set
$p^+_\beta(x, +1) = p^+_\beta(+1, y) = 0$. Denote by $P^+_\beta$ the matrix with entries
$p^+_\beta(x,y)$ and by $\psi_0$ its eigenvector with largest eigenvalue. By the Perron-Frobenius
Theorem, we can assume $\psi_0(x) \ge 0$. We claim that $\psi_0(x)$ is monotonically decreasing in $x$.
Indeed consider the transformation $\psi \mapsto T(\psi) \equiv P^+_\beta \psi / \|P^+_\beta \psi\|_{2,\mu}$.
This is a continuous mapping from the set of unit vectors in $L^2(\mu)$ onto itself. Further, if $\psi$
is monotone and non-negative, $T(\psi)$ is monotone and non-negative as well (the first property follows
from monotonicity of the dynamics). The set of non-negative and monotone unit vectors in $L^2(\mu)$ is
homeomorphic to a simplex. By the Brouwer fixed point theorem, $T$ has at least one fixed point that is
non-negative and monotone, which therefore coincides with $\psi_0$ by Perron-Frobenius.

Lemmas IB. II and IE. II implv that there exists O = {x G 5 : il^o{x) > h}, such that 

T^{F;h^) < C„(l + /3) E.gn M^) _ ^^^^ 

for some $\beta$-independent constant $C_n$. Using $\tau_+(F;h^F) = \exp\{2\beta\Gamma_*(F;h^F) + o(\beta)\}$
and the large-$\beta$ asymptotics of $\mu(x)$, $p^+_\beta(x,y)$ we get

    $\Gamma_*(F;h^F) \le \min_{(S_1,S_2) \in \partial\Omega} \max_{i=1,2} [\mathrm{cut}(S_i, V \setminus S_i) - |S_i|_h] + o_\beta(1)$ .    (15)

Since $\psi_0(x)$ is monotone, $\Omega$ is monotone as well and therefore the last inequality implies
the thesis.
□ 



4.2 Theorem 3.3



Proof. (Lemma 3.2). By Theorem 3.1 it is sufficient to find an upper bound for $\Gamma(F;h^F)$ for
every induced subgraph $F$. By monotonicity of $\Gamma(F;h)$ with respect to $h$,
$\Gamma(F;h^F) \le \Gamma(F;h)$. We will upper bound $\Gamma(F;h)$ by showing that Eq. (9) holds for any
induced subgraph $F' \subseteq F$.

First notice that, for any $U$ and for any $\theta$, there exists $S \subseteq U$ such that
$|S|_h \in J(\theta)$ and

    $\mathrm{cut}(S, U \setminus S) - \frac{1}{4}|S|_h \le \alpha h_{\min}^{-\gamma} |S|_h^{\gamma} - \frac{1}{4}|S|_h \le A'(\alpha,\gamma)\, h_{\min}^{-\gamma/(1-\gamma)}$ ,    (16)

where $A'(\alpha,\gamma) \equiv \max(\alpha x^\gamma - x/4 : x \ge 0)$. Take
$L_1 = A'(\alpha,\gamma)\, h_{\min}^{-\gamma/(1-\gamma)}$ and $L_2 = L_1 + 2h_{\max}$. By Eq. (16)

    $\min_{|S|_h \in [L_1, L_2]} [\mathrm{cut}(S, V(F) \setminus S) - \frac{1}{4}|S|_h] \le L_1$ .

Finally the cutwidth of any set $S$ with $|S|_h \le L_2$ is upper bounded by
$\alpha |S|^\gamma \log|S|$ (using [LR99] and Eq. (8)), which is at most
$C = A''(\alpha,\gamma,h_{\max})\, h_{\min}^{-\gamma/(1-\gamma)} \log\max(2, h_{\min}^{-1})$. The thesis
thus follows by applying Theorem 3.4.

To prove the lower bound we use Theorem 3.1 again. Let $F$ be the subgraph induced by $U$. By
monotonicity of $\Delta(G;h)$ with respect to $h$, for $t = \lfloor \delta|U| \rfloor$, we have

    $\Delta(F;h^F) \ge \Delta(F; h_{\max} + b) \ge \min_{|S| = t} [\lambda|S| - (h_{\max} + b)|S|]$ ,

which implies the thesis. □



We notice in passing that the estimates in the second part of this proof could be improved by
using more specific arguments instead of directly applying Theorem 3.1.

For the proof of Theorem 3.3 we need to estimate the isoperimetric function of finite-range
$d$-dimensional graphs. This can be done by an appropriate relaxation.

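Given a function $f : V \to \mathbb{R}$, $i \mapsto f_i$, and a set of non-negative weights $w_i$,
$i \in V$, we define

    $\|f\|_w^2 \equiv \sum_{i \in V} w_i f_i^2$ ,    $\|\nabla_G f\|^2 \equiv \sum_{(i,j) \in E} |f_i - f_j|^2$ .    (17)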

We then have the following generalization of Cheeger inequality. 

Lemma 4.3. assume there exists two vertex sets Qi C Qq CV and a function f : V ^ R such that: 
(1) fi > \ fj\ for any i G and any j G V; (2) fi = OforieV\ Qq; (3) Li < < \Qo\u, < L2; 

(4) ||Vg/|P < M\f\\l- Then there exists 5 C F with Li < \SU < L2 

cut(S, V\S)< JAX max{|ai|//ij \S\h . (18) 

V i&V 

The proof of this Lemma is deferred to Appendix A.

Proof. (Theorem 3.3) Finite-range $d$-dimensional networks. We need to prove that, for each induced
subgraph $G'$, $\Gamma(G';h^{G'}) = O(1)$. By Theorem 3.4 it is sufficient to show that, for any
induced and connected subgraph $F$, there exists a set $S$ of bounded size such that
$\mathrm{cut}(S, V(F) \setminus S) - |S|_{h'^F} \le 0$, with $h'_i = h_i/4$. If the original graph is
embeddable, any induced subgraph is embeddable as well. Since $h^F_i \ge h_i$, the thesis follows by
proving that for any embeddable graph $G$, we can find a set of vertices $S$ of bounded size with
$\mathrm{cut}(S, V \setminus S) \le |S|_h/4$.



We will construct a function / with bounded support such that ||Vg/|| ^ '^ll/ll "^i^h A = 
min,;gy{ ig | gj| }■ In order to achieve this goal, consider the d-dimensional of G and partition IR*^ 
in cubes C of side i to be fixed later. Denote by Cq the cube maximizing X^ra^GC^*' ^i' 
j = 1, . . . 3*^ — 1 be the adjacent cubes. Let fi = (p{xi), where for x G M!^, we have 



(fix) 



(19) 



Notice that \V^p{x)\ < l/l and \Vip{x)\ > only if x € Cj, j = 1,...3'^"^ Since \fi - fj\ < 
\Vip\ \\xi — XjW we have 



^ ^ iev V / * 

< 3^(j) rrm^{\di\/hi}Y,hil[x,eCo)<3''(j) um^m/hMfWl- (20) 

The thesis follows by choosing £ = m.aK.i^Y{\di\/hi}. 

Small world networks with $r \ge d$. Let $U$ be a subset of vertices forming a cube of side $\ell$ and
$G_U$ a $(\epsilon, k - 5/2)$, $k$-regular expander with vertex set $U$. Such a graph exists for all
$\ell$ large enough and $\epsilon$ small enough by [Kah92]. Call $A_U$ the event that the subgraph
induced by long-range edges in $U$ coincides with $G_U$, and no long-range edge from
$i \in V \setminus U$ is incident on $U$.

Under $A_U$, the subgraph $G_U$ satisfies the hypotheses of Lemma 3.2, second part, with $b = d$.
Therefore $\Gamma_*(G;h) \ge (k - 5/2 - h_{\max} - d) \lfloor \epsilon \ell^d / 4 \rfloor$. The thesis
thus follows if we can prove the existence of $U$ with volume
$\ell^d = \Omega(\log n / \log\log n)$ such that $A_U$ is true.

Fix one such cube $U$. The probability that the long-range edges inside $U$ induce the expander
$G_U$ is larger than $(C(n)\,\ell^{-r})^{k\ell^d}$. On the other hand, for any vertex $i \in U$, the probability that no
long-range edge from $V \setminus U$ is incident on $i$ is lower bounded as

$\prod_{j \in V \setminus U} \big[1 - C(n)\,|i-j|^{-r}\big]^{k} \ge \exp\Big\{-3k\,C(n) \sum_{j \in V \setminus U} |i-j|^{-r}\Big\}$

where we used the lower bound $1 - x \ge e^{-3x}$, valid for all $0 \le x \le 1/2$, together with the fact that
$C(n) \le 1/(2d)$ (which follows by considering the $2d$ nearest neighbors). From the definition of $C(n)$,
the last expression is lower bounded by $e^{-3k}$, whence

$\mathbb{P}\{A_U\} \ge \big[C(n)\, e^{-3}\, \ell^{-r}\big]^{k\ell^d}$ .

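Let $\mathcal{S}$ denote a family of $n/\ell^d$ disjoint subcubes, and denote by $N_{\mathcal{S}}$ the number of such subcubes
for which property $A_U$ holds. Then $\mathbb{E}[N_{\mathcal{S}}] = (n/\ell^d)\,\mathbb{P}\{A_U\}$. Using the above lower bound together
with the fact that $C(n) \ge C_{r,d} > 0$ for $r > d$ and $C(n) \ge C_{r,d}/\log n$ for $r = d$, it follows that there exist
$a, b > 0$ such that $\mathbb{E}[N_{\mathcal{S}}] = \Omega(n^a)$ if $\ell^d \le b \log n/\log\log n$.

The proof is finished by noticing that, for $U \cap U' = \emptyset$, $\mathbb{P}\{A_U \cap A_{U'}\} \le \mathbb{P}\{A_U\}\,\mathbb{P}\{A_{U'}\}$, whence
$\mathrm{Var}(N_{\mathcal{S}}) \le \mathbb{E}[N_{\mathcal{S}}]$. The thesis follows by applying the Chebyshev inequality to $N_{\mathcal{S}}$.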

Small world networks with $r < d$. It is proved in [Fla06] that these graphs are expanders with high
probability. The thesis follows from Lemma 3.2.

Random regular graphs. It is well known that a random $k$-regular graph is with high probability
a $(k - 2 - \delta)$ expander for all $\delta > 0$ [Kah92]. The thesis follows again from Lemma 3.2. □
A Proof of Lemma 4.3

Assume without loss of generality that $\max\{|f_i| : i \in V\} = 1$, whence $f_i = 1$ for $i \in \Omega_1$. We use
the same trick as in the proof of the standard Cheeger inequality




(21) 



The denominator is upper bounded by 




\di 



ml- 



(22) 



The argument in parenthesis at the numerator is instead equal to 




(23) 



where Sz = {i £ V : f1 > z}. The quantity above is lower bounded by 




(24) 



Let $S = S_{z_*}$, where $z_*$ realizes the above minimum (the function to be minimized is piecewise constant
and right continuous, hence the minimum is realized at some point). Notice that $\Omega_1 \subseteq S_z \subseteq \Omega_0$ for
all $z \in [0, 1]$, and thus we have in particular $L_1 \le |S|_h \le L_2$. Further, from the above



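$\lambda \ge \dfrac{\|\nabla_G f\|^2}{\|f\|_h^2} \ge \dfrac{1}{4} \min_{i \in V}\Big\{\dfrac{h_i}{|d_i|}\Big\} \Big(\dfrac{\mathrm{cut}(S, V \setminus S)}{|S|_h}\Big)^2$ , (25)

which finishes the proof.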
□ 
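As an illustration of the level-set argument used in this appendix, the following minimal Python sketch scans the level sets of a function on a small graph and returns the one minimizing $\mathrm{cut}(S, V \setminus S)/|S|_h$. The edge-list encoding, the helper names `cut_size` and `best_level_set`, and the choice of thresholding `f` directly are assumptions made for this example only; they are not taken from the proof.

```python
def cut_size(edges, subset):
    """Number of edges with exactly one endpoint in `subset`."""
    return sum((u in subset) != (v in subset) for u, v in edges)


def best_level_set(edges, f, h):
    """Scan level sets S_z = {i : f[i] > z} and minimize cut(S_z) / |S_z|_h.

    edges: list of (u, v) pairs; f, h: dicts vertex -> value, with h > 0.
    Only thresholds strictly below max(f) are used, so every S_z is
    non-empty; thresholds are taken among the values of f (assumption).
    Returns (None, None) if f is constant.
    """
    best_ratio, best_set = None, None
    top = max(f.values())
    for z in sorted(set(f.values())):
        if z >= top:
            continue
        level = {i for i in f if f[i] > z}
        ratio = cut_size(edges, level) / sum(h[i] for i in level)
        if best_ratio is None or ratio < best_ratio:
            best_ratio, best_set = ratio, level
    return best_ratio, best_set


# Hypothetical example: a path on six vertices with unit weights.
path_edges = [(i, i + 1) for i in range(5)]
f_example = {0: 1.0, 1: 1.0, 2: 0.5, 3: 0.0, 4: 0.0, 5: 0.0}
h_example = {i: 1.0 for i in range(6)}
ratio, level_set = best_level_set(path_edges, f_example, h_example)
```

In this toy case the scan only ever considers sets sandwiched between the set where `f` is maximal and the support of `f`, which mirrors the role of the two nested sets in the lemma.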



B Hitting times at low temperature: proof of Lemma 4.1

We consider a general setting for Lemma 4.1: a discrete time Markov chain with state space $S$,
transition probabilities $p_\beta(x,y)$, reversible with respect to the stationary distribution $\mu(x)$. Given
$A \subset S$, define $p^A(x,y) = p_\beta(x,y)$ if $x, y \in S \setminus A$ and $p^A(x,y) = 0$ otherwise. Notice that by reversibility
the eigenvalues of $p^A$ are real, and smaller than 1. We assume that $p^A$ is irreducible and aperiodic.

The lower bound in the next lemma is due to Donsker and Varadhan [DV76]; we nevertheless
propose an elementary proof.

Lemma B.1. If $1 - \lambda_{0,A}$ is the largest eigenvalue of $p^A$, then

1 1 r 1 1 1 

log(l/(l - Xo,a)) - - log(l/(l - Xo,a)) I ^2™lvi°^M^J ■ 

Proof. Let $P_A$ denote the matrix with entries $p^A(x,y)$, and $f(x)$ be the characteristic function of
$S \setminus A$. Then $\mathbb{P}_x\{\tau_A > t\} = (P_A^t f)(x)$, whence



v^PxITa > n < ^^/i(x)p,.{rA > = ||pA/|U,2 < (1 - XoaY , 

which proves the upper bound. To prove the lower bound, let $\psi_0(x)$ denote the eigenvector of
$P_A$ with eigenvalue $1 - \lambda_{0,A}$, and notice that, by the Perron-Frobenius theorem, it has non-negative entries.
Therefore

maxF,{TA>t} {^po, f)^, >J2 f'(^)Mx)PATA > t} = {I - Xo,A)\i^o, f) ■ 

X 

□ 
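The identity $\mathbb{P}_x\{\tau_A > t\} = (P_A^t f)(x)$ used in this proof can be checked numerically on a toy chain. The sketch below is an illustration under stated assumptions, not part of the argument: the lazy symmetric walk on a path, the dictionary encoding of the kernel, and all function names are hypothetical choices made here. Because the toy kernel is symmetric (so $\mu$ is uniform), a Rayleigh quotient gives an estimate from below of the largest eigenvalue $1 - \lambda_{0,A}$ of the killed kernel.

```python
def killed_step(P, alive, vec):
    """Apply the killed kernel: (P_A vec)(x) = sum over y outside A of P[x][y] * vec[y]."""
    return {x: sum(P[x].get(y, 0.0) * vec[y] for y in alive) for x in alive}


def survival(P, alive, t):
    """Return x -> P_x{tau_A > t}, computed as P_A^t applied to the indicator of S minus A."""
    vec = {x: 1.0 for x in alive}
    for _ in range(t):
        vec = killed_step(P, alive, vec)
    return vec


def top_eigenvalue_estimate(P, alive, iterations=500):
    """Power iteration plus Rayleigh quotient; assumes the killed kernel is symmetric."""
    vec = {x: 1.0 for x in alive}
    for _ in range(iterations):
        vec = killed_step(P, alive, vec)
        norm = sum(v * v for v in vec.values()) ** 0.5
        vec = {x: v / norm for x, v in vec.items()}
    image = killed_step(P, alive, vec)
    return sum(vec[x] * image[x] for x in alive)


def lazy_path_walk(n):
    """Lazy symmetric walk on {0, ..., n-1}: move to each neighbor w.p. 1/4, else stay."""
    P = {x: {} for x in range(n)}
    for x in range(n):
        for y in (x - 1, x + 1):
            if 0 <= y < n:
                P[x][y] = 0.25
        P[x][x] = 1.0 - sum(P[x].values())
    return P


# Hypothetical example: six states, absorbing set A = {0}.
n_states = 6
kernel = lazy_path_walk(n_states)
alive_states = list(range(1, n_states))
theta = top_eigenvalue_estimate(kernel, alive_states)  # estimate of 1 - lambda_{0,A}
```

For this chain one can compare `max(survival(kernel, alive_states, t).values())` with `theta ** t`: the former stays above the latter, while the $\mu$-average of the survival probabilities stays below it, in line with the two-sided bound of the lemma.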



Proof (Lemma 4.1). Due to Lemma B.1 it is sufficient to prove that $\lambda_{0,A} = \exp\{-\beta\Gamma_A + o(\beta)\}$. To
this end we use the well known variational characterization of eigenvalues

$\lambda_{0,A} = \inf_{\varphi} \dfrac{\mathrm{Dir}(\varphi)}{\sum_x \mu(x)\varphi(x)^2}$ , $\mathrm{Dir}(\varphi) = \dfrac{1}{2}\sum_{x,y} \mu(x)\, p_\beta(x,y)\,(\varphi(x) - \varphi(y))^2$ . (26)

Here the inf is taken over non-vanishing functions $\varphi : S \setminus A \to \mathbb{R}$.

A lower bound can be obtained by comparison. More precisely, for each $z \in S \setminus A$, let $\omega^{(z)}$ be
a path of allowed transitions from $z$ to $A$. Proceeding along the lines of [JS89, DSC93], one obtains
that $\lambda_{0,A} \ge 1/\max_{x,y} C(x,y;\omega)$ where, for each allowed transition $x \to y$, we defined the associated
congestion as

C{x,y]uj) = ^ V 

The thesis then follows by choosing the paths $\omega^{(z)}$ in such a way as to achieve the minimum in Eq. (10)
and taking the limit $\beta \to \infty$.

To get an upper bound, define the boundary $\partial B$ of a set of configurations $B$ as the subset of couples
$(x, y)$ such that $p_\beta(x,y) > 0$ and $x \in B$, while $y \notin B$. Notice that from Eq. (10) it follows that there
exists a set $B \subseteq S \setminus A$ such that

$\Gamma_A = \min_{(x,y) \in \partial B} [H(x) + V(x,y)] - \min_{z \in B} H(z)$ .

The proof is completed by taking $\varphi$ in Eq. (26) to be the characteristic function of $B$. □



C Proof of Lemma 4.2

Construct the following partitioning of T into Ti = {vi,V2, ■ ■ ■ Vi^^-i}, T2 = {wjj , "Uij+i, • • • Wj2-i} 
■ ■ - Tr = {fv.i • • • Vk} in such a way that for every Tj = {vi^_-^ , • • • Vi.-i} and < I < ij, M{Tj,vi — 
1) = M{{vi^_^ ■ ■ ■ vi-i}, vi-l)<0 and for / = ij, M{Tj,vi - 1) > 0. 

Such a partition can be obtained in the following way. Start with $j = 1$ and iteratively add the $v_i$'s to
the current set $T_j$. If $M(T_j, v_i - 1) > 0$, increment $j$ and add $v_i$ and the next vertices to the new
subset.

Let Tr = {vs, ■ • • ,vt} he the last subset in the above sequence. We claim that if M{Tr, k) < then 
M{Tr,i) < for all i > Vg- For every s < j < t and every i between vj and Wj+i by supermodularity 
M{Tr,i) = M{{vi, • ■ ■ Vj}, i) < M{{vi, ■ ■ ■ vj}, vj+i — 1) < 0. The same argument goes for vt < i < k. 
In that case the lemma is correct for T' = T,.. 

If M{Tr, k) > 0, we will show that the second statement of the lemma is true. For that, we need 
to write the H function for all sets Ti , • • • explicitly. For a set Tj and / = ij 



M{Tj,vi-l) 



cut(Tj, {1,2, ■■■vi-1}) - cut(Tj, {vi,vi + 1, • • - n}) + ^ /li 



ieTi 



> 0. 



(27) 



One can write a similar equation for $j = l$ by replacing $v_l - 1$ with $k$. Equation (12) gives a similar
inequality for $R$. Adding up these inequalities for all $j$ and $R$ and noting that the contribution of
every edge with both ends in $\cup_j T_j \cup R$ cancels out, we get



i-i 



M{T UR,k)>Y^ M{Tj,Vi^ - 1) + M{Ti,k) + M{R, 0) > 0. 



(28) 



□ 



D Proof of Theorem 3.4

Proof (Theorem 3.4). Partition $V$ into subsets $R_1, R_2, \dots, R_l$ by letting $V_0 = V$ and defining
recursively

$R_t = \arg\min_{S \in \Omega_t} \{\mathrm{cut}(S, V_t \setminus S) - |S|_{h^{V_t}}\}$

where $V_t = V \setminus \cup_{s=1}^{t-1} R_s$ and $\Omega_t$ is the set of all subsets $S \subseteq V_t$ such that $L_1 \le |S|_h \le L_2$. With an
abuse of notation, we wrote $h^{V_t}$ for $h^{G(V_t)}$ ($G(V_t)$ being the subgraph induced by $V_t$). Explicitly, for
any $j \in V_t$, $(h^{V_t})_j = h_j + |d_j|_{V \setminus V_t}$.

Continue this process until no such set $S$ can be found, and let $R_l = V_l$ be the residual set. Notice
that, since $L_2 \ge h_{\max}$, we necessarily have $|R_l|_h < L_1$. By applying Eq. Q to $F = G(V_t)$, we have

$\mathrm{cut}(R_t, V_t \setminus R_t) \le |R_t|_{h^{V_t}} + L_1 \le |R_t|_{h^{V_t}} + |R_t|_h = |R_t|_{2h} + \mathrm{cut}(R_t, V \setminus V_t)$ . (29)

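Notice that $\mathrm{cut}(R_t, V_t \setminus R_t) - \mathrm{cut}(R_t, V \setminus V_t) = \mathrm{cut}(\cup_{s=1}^{t} R_s, V_{t+1}) - \mathrm{cut}(\cup_{s=1}^{t-1} R_s, V_t)$. By summing up
this relation, we have, for all $1 \le t \le l$,

$\mathrm{cut}(\cup_{s=1}^{t} R_s, V \setminus \cup_{s=1}^{t} R_s) \le \sum_{s=1}^{t} |R_s|_{2h} = |\cup_{s=1}^{t} R_s|_{2h}$ .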

For each Rt, consider a linear arrangement of the induced subgraph that achieves its cutwidth. 
Construct a linear arrangement of V by concatenating the above linear arrangement of each Rt in 
the order t = 1,2, ... ,1. We will show that this ordering gives us the desired upper bound on the 
tilted cutwidth of $G$. Let $S = \cup_{s=1}^{t-1} R_s \cup R$ where $R \subseteq R_t$ for some $t$ between 1 and $l$. Then

cut{S,V\S) < cut{ul'l\Rs,V\ulz\Rs) + cut{Rt,V\Vt) + cutwidtli{Rt) 

< cut(U*-\i?„ V \ ulz\Rs) + cnt{Rt, V\Vt) + \Rt\h + + C 

< 2 cut(U*-^ii?s, V \ U*=\i?s) + L1 + L2 + C 

< 2\ulz\Rs\2h + Li + L2 + C. 

□ 
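The concatenation step of this proof can be made concrete with a short Python sketch. It computes the cut crossing every prefix of a linear arrangement and the largest such cut. This is a sketch under simplifying assumptions: it evaluates the plain (untilted) prefix cuts only, omitting the $|S|_h$ term of the tilted cutwidth, and the names `arrangement_cut_profile`, `arrangement_width` and `concatenate` are hypothetical.

```python
def arrangement_cut_profile(edges, order):
    """Return cut(S_p, V minus S_p) for every proper prefix S_p of `order`."""
    position = {v: p for p, v in enumerate(order)}
    profile = []
    for p in range(1, len(order)):
        # An edge crosses the p-th gap if its endpoints lie on opposite sides of it.
        crossing = sum(
            (position[u] < p) != (position[v] < p) for u, v in edges
        )
        profile.append(crossing)
    return profile


def arrangement_width(edges, order):
    """Largest prefix cut of a linear arrangement (0 for fewer than two vertices)."""
    return max(arrangement_cut_profile(edges, order), default=0)


def concatenate(blocks):
    """Concatenate per-block arrangements R_1, ..., R_l into one ordering of V."""
    return [v for block in blocks for v in block]


# Hypothetical example: a path on four vertices split into two blocks.
example_edges = [(0, 1), (1, 2), (2, 3)]
good_order = concatenate([[0, 1], [2, 3]])
bad_order = [0, 2, 1, 3]
```

Comparing `arrangement_width(example_edges, good_order)` with `arrangement_width(example_edges, bad_order)` shows how the choice of ordering inside and across blocks changes the largest prefix cut, which is the quantity the proof controls block by block.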



E Eigenvectors and barriers 

As in the last appendix, we consider here a general Markov chain with state space $S$, and let $A \subseteq S$ be
a subset of configurations.

Lemma E.1. Let $\psi_0 : S \to \mathbb{R}$ be the unique eigenvector of $P_A$ with eigenvalue $1 - \lambda_{0,A}$ and assume
(without loss of generality, by the Perron-Frobenius theorem) $\psi_0(x) \ge 0$. Then there exists $b > 0$ such
that, letting $B = \{x \in S : \psi_0(x) \ge b\}$, we have

J_ Eix,y)edBl^i^)Pl3(.X,y) < ^ < E(x,y)edBl^i^)Pl3(.X,y) ^^^^ 
l^l Ea;eB^(^) - 0, _ ^^^Bfi{x) 

Proof. The upper bound follows immediately by substituting the indicator function $\mathbb{I}(x \in B)$ in the variational
principle (26).



In order to prove the lower bound, let $0 = \psi^{(0)} < \psi^{(1)} < \cdots < \psi^{(N)}$ be the points in the image of
$\psi_0(\cdot)$ (obviously $N \le |S|$). For any $(x,y)$ such that $\psi_0(x) = \psi^{(i)}$, $\psi_0(y) = \psi^{(j)}$ with $i < j$, we have
$(\psi_0(x) - \psi_0(y))^2 \ge \sum_{l=i+1}^{j} (\psi^{(l)} - \psi^{(l-1)})^2$. Therefore, by letting $B_i = \{x \in S : \psi_0(x) \ge \psi^{(i)}\}$, we have

TV 

Dir(V.o)>i;^(0(V'^'^-^^'"'¥, W{1)^ J2 Kx)pp{x,y). (31) 
i=i {x,y)edBi 

On the other hand, (V'W)^ < i J^Uii^^^^^ - If M{1) = li{x)l{ipo{x) = i^^^) = n{Bi) - 

N N N 

E(^g) = (^^'^)' ^ E (E^^W) (^^') - V'^'-'))^ (32) 

i=0 1=1 i=l 

Therefore 

which implies the thesis. □ 



